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ABSTRACT 

The bilateral filter is a versatile non-linear filter that has found di¬ 
verse applications in image processing, computer vision, computer 
graphics, and computational photography. A common form of the fil¬ 
ter is the Gaussian bilateral filter in which both the spatial and range 
kernels are Gaussian. A direct implementation of this filter requires 
O(cr^) operations per pixel, where a is the standard deviation of 
the spatial Gaussian. In this paper, we propose an accurate approx¬ 
imation algorithm that can cut down the computational complexity 
to 0(1) per pixel for any arbitrary a (constant-time implementation). 
This is based on the observation that the range kernel operates via the 
translations of a fixed Gaussian over the range space, and that these 
translated Gaussians can be accurately approximated using the so- 
called Gauss-polynomials. The overall algorithm emerging from this 
approximation involves a series of spatial Gaussian filtering, which 
can be efficiently implemented (in parallel) using separability and 
recursion. We present some preliminary results to demonstrate that 
the proposed algorithm compares favorably with some of the exist¬ 
ing fast algorithms in terms of speed and accuracy. 

Index Terms — Bilateral filter, approximation. Gauss-polynomial, 
convolution, fast algorithm. 

1. INTRODUCTION 

The bilateral filter of Tomasi and Maduchi m is a particular instance 
of an edge-preserving smoothing filter. The origins of the filter can 
be traced back to the work of Lee 12! and Yaroslavsky 0. The SU¬ 
SAN framework of Smith and Brady 1^ is also based on a similar 
idea. The relation between the bilateral and other closely related fil¬ 
ters is surveyed in 0. The bilateral filter has turned out to be a versa¬ 
tile tool that has found widespread applications in image processing, 
computer graphics, computer vision, and computational photogra¬ 
phy. A detailed survey of some of these applications can be found in 
0. More recently, the bilateral filter has received renewed attention 
in the context of image denoising GH. The original bilateral filter 
m has a straightforward extension to signals of arbitrary dimension 
and, in particular, to video and volume data 0. Thus, while we will 
limit our discussion to images in this paper, the ideas that we present 
next can also be extended to higher-dimensional signals. 

Consider a discrete image {/(i) : i G /} where / is some finite 
rectangular domain of Z^. The Gaussian bilateral filtering of this 
image is given by 

, . _ Eien (j) (/(i - j) -/(i))/(i - j) 
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where both the spatial and range kernels are Gaussian, 

S<T,(i) = exp and = exp . (2) 

In practice, the domain of the spatial kernel Q is restricted to some 
neighbourhood of the origin. Typically, Q is a square neighbourhood, 
VL^[-W,W]x [-IL, VL] where IL = 3cr, HI. We refer the reader 
to w for a detailed exposition on the working of the filter. 

1.1. Fast Bilateral Filter 

It is clear that a direct implementation of H]) requires 0{as) oper¬ 
ations per pixel. In general, the directly computed bilateral filter 
is slow for practical settings of cts Ifi). To address this issue, re¬ 
searchers have come up with several fast algorithms liiioiiniiiii 
dlHIISlIIel that are based on some form of approximation and 
yield various levels of speed and accuracy. We refer the interested 
reader to 01151 for a survey of algorithms for fast bilateral filter¬ 
ing. The ultimate goal in this regard is to reduce the complexity to 
0(1) per pixel, that is, the run time of the implementation should 
not depend on cr^. This is commonly referred to as a constant-time 
implementation. The constant-time algorithms in lIT^ \T6l [TTl \TSi 
are particularly relevant to the present work. The authors here pro¬ 
ceed by approximating the Gaussian range kernel using polynomial 
and trigonometric functions, and demonstrate how the bilateral filter 
can be decomposed into a series of spatial Gaussian filters as result. 
Now, since a Gaussian filter can be implemented in constant-time 
using separability and recursion HU, the overall approximation can 
therefore be computed in constant-time. 

1.2. Present Contribution 

In this paper, we propose a fast 0(1) algorithm for computing ([T]) 
which is motivated by the line of work in (Eiiiiiniiiii. In particu¬ 
lar, we present a novel approximation for the range term in m that al¬ 
lows us to decompose the bilateral filter into a series of fast Gaussian 
convolutions. The fundamental difference between the above papers 
and the present approach is that instead of approximating the Gaus¬ 
sian and then translating the approximation, we directly approximate 
the translated Gaussians using the so-called Gauss-polynomials. The 
advantages of the proposed approximation are the following: 

• It is generally much more accurate than the polynomial ap¬ 
proximation in ca. 

• For a fixed approximation degree (to be defined shortly), it 
leads to exactly half the number of Gaussian filterings than that re¬ 
quired by the approximation in |[T6l[T8l, and hence has a smaller run 
time. 

• It does not involve transcendental functions such as cos(ct;x) 
and sm{ux) which are used in lITfil [TSl . It only involves polyno¬ 
mials (and just a single Gaussian) and hence can be efficiently im- 




plemented on hardware This is partly what motivated the 

present work. 

Moreover, we also show how the proposed approximation can be 
improved by first centering the range data, then applying the approxi¬ 
mation algorithm, and finally adding back the centre to the processed 
range data. 


2. GAUSS-POLYNOMIAL DECOMPOSITION 

The main idea in (TJ] was to approximate the range kernel in 
© using appropriate polynomials and trigonometric functions. By 
using these approximations in place of the Gaussian kernel, it was 
shown that the numerator and denominator of Q can be approxi¬ 
mated using a series of Gaussian filtering. 

The present idea is to consider the translates of the range kernel 
gar(t — that appear in ([T]), where t = /(i — j) and r = /(i) take 
values in some intensity range, say, [L,U]. For example, L = 0 and 
U = 255 for an 8-bit grayscale image. We can write 

-r) = exp exp exp . (3) 

For a fixed translation r, this is a function of t. Notice that the first 
term is simply a scaling factor, while the second term is a Gaussian 
centered at the origin. In fact, the second term essentially contributes 
to the “bell” shape of the translated Gaussian. The third term is a 
monotonic exponential function which is increasing or decreasing 
depending on the sign of r. This term helps in translating the Gaus¬ 
sian to t = T. The decomposition in m plays an important role in 
the rest of the discussion and is illustrated in Figure [T] 



(a) Gaussian centered at r = 100. 




(b) Gaussian centered at r = 0. (c) Exponential function. 

Fig. 1. An instance of the decomposition in © corresponding to 
T = 100 and ar = 30. We used L = 0 and U = 255 corresponding 
to the intensity range of an 8-bit grayscale image. Up to a scaling 
factor of about 4e-3, (a) is a product of (b) and (c). 


Consider the Taylor series of the exponential function, 

y at 1 ^ 

exp ( — I = —r ( ^ ) + higher-order terms. (4) 

\(Tr ) n\ \GiJ 

By dropping the higher-order terms, we obtain the following approx¬ 
imation of ©: 



Being the product of a Gaussian and a polynomial of degree N, we 
will henceforth refer to © as a “Gauss-polynomial” of degree N. 

At this point, we note that one of the proposals in oa was to 
approximate ga^. {t — t) using its Taylor polynomial. The funda¬ 
mental difference with our approach is that instead of approximating 
the entire Gaussian, we approximate one of its component, namely 
the monotonic exponential in ©. The intuition behind this is that a 
polynomial eventually goes to infinity as one moves away from the 
origin. This makes it difficult to approximate the Gaussian function 
on its asymptotically-decaying tails. As against this, the exponen¬ 
tial function in © is monotonic and hence can be more accurately 
approximated using polynomials. This is explained with an exam¬ 
ple in Figure m In particular, notice in Figure [2|(b) that the Gauss- 
polynomial approximation is fairly accurate over the entire range of 
interest and is comparable to the raised-cosine approximation US). 



Fig. 2. Approximation of gcrrit — 't) where ar — 30 and r = 10. 
The raised-cosine GH, the Taylor polynomial 113, and the Gauss- 
polynomial have the same degree N = 10. In subfigure (a), notice 
how the Taylor polynomial quickly goes off to +oo as one moves 
away from the origin. For this reason, we restricted the plot to the 
interval [0,170], although the desired approximation range is the full 
dynamic range [0,255]. The approximation over [0,255] for the 
raised-cosine and the Gauss-polynomial are separately provided in 
subfigure (b). 


We note that for a fixed degree N, the accuracy of the Gauss- 
polynomial approximation in (|5j depends on r. Indeed, when r = 0, 
there is nothing to approximate since the exponential function re¬ 
duces to a constant in this case. On the other hand, we see from 
that the magnitude of the higher-order terms increases with in¬ 
crease in |t|, and the approximation accuracy drops as a result. This 
is demonstrated with an example in Figure [3] (a). Of course, the ap¬ 
proximation can be improved by using Gauss-polynomials of higher 
degree. However, we note that for a fixed degree, the approximation 
accuracy can be improved simply by reducing the maximum |t| that 
appears in We propose the following “centering” trick in which 





















where 




(b) 


Fig. 3. (a) Approximation of gar{t — r),ar = 30, over [0, 255] us¬ 
ing Gauss-polynomials of degree 20. We vary r over 0,10, 50,120. 
(b) Approximation over the centered interval [—255/2,255/2], 
where we vary r over —100, —50, 0,10, 50,100. This demonstrates 
how the approximation accuracy can be improved by simply center¬ 
ing the range interval at the origin. 


we set, for example, 

tc = mean {/(i) : i /(O- 

We next translate each pixel intensity by tc, which has the effect of 
centering the transformed intensity range at the origin. For example, 
when L = 0 and U = 255, the maximum |t| equals 255. However, 
if we center the intensity range [0, 255], say at tc = 100, then we can 
effectively reduce the maximum \r\ to 155. The Gauss-polynomial 
approximations obtained after the centering are shown in Figure [3 
(b). The above idea of centering is compatible with the bilateral filter 
precisely because of the following property of the bilateral filter. 

Proposition 2.1 Ifh{i) = /(i) — tc, then 

/bf(i) = /iBF(l) + tc (i G /). (6) 

This is a simple consequence of the fact that the range kernel depends 
only on the intensity difference, and that for a fixed range term, Cl) 
preserves constant functions. In other words, we can first centre the 
intensity range, apply the bilateral filter, and finally add back the 
centre to the output. 


3. FAST BILATERAL FILTER 

We now present the constant-time implementation of ([B using 
Gauss-polynomials. Suppose that N is the degree of the polynomial 
in For n = 0,..., iV + 1, define the images 

Gn(i) = and F„(i) = exp G„(i). (7) 

Denote the Gaussian filtering of Fn(i) by Fn{i), that is, 

Fn{}) — (-Fn * 5'o-s ) (l) = 5'o-s (j)Fn(l — j). (8) 

jen 


Substituting t — f{i — j) and r = /(i), and using the Gauss- 
polynomial approximation in place of ga^ {t — it can be veri¬ 
fied that (after interchanging summations) we can express the numer¬ 
ator of Q as 



p{^), 


1 

P(i) = a.V-G„(i)Fn+i(i). 

n\ 

n=0 

Similarly, we can express the denominator of 0 as 


exp 


2cr2 


Q(l), 


where 


^ 1 


In other words, we can approximate ©by 
/bfW = P(i)/Q(i). 


( 9 ) 


( 10 ) 


( 11 ) 


Note that we have effectively transferred the non-linearity of the bi¬ 
lateral filter to the intermediate images in 0, which are obtained 
from the input image using pointwise non-linear transforms. The 
main leverage that we get from the above manipulation is that, for 
any arbitrary Us, (0 can be computed using 0(1) operations per 
pixel GH. The overall cost of computing CB is therefore 0(1) per 
pixel. In other words, we have a constant-time approximation of the 
bilateral filter. In this regard, we note that the above analysis holds if 
we replace the spatial Gaussian filter by any other filter (e.g., a box 
filter) that has a constant-time implementation. 
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Data: Image {/(i) : i G /}, and parameters as,<Jr, N. 
Result: Approximation {/bf(i) : i G /}. 
tc = mean {/(i) : i G /}; 

H^) = /(i) - tc, 

G(i) = 1; 

P(i) = exp(—ft(i)^/2(Tr); 

P(i) = 0; 

Q(i) = 0; 

H{i) = h{i)/ar\ 
c = 1; 

forn = 0,1,..., do 

Q(i) = Q(i) + c.G(i)P(i); 

F(i) = B(i)F(i); 

F(i) = (F * (i); 

F(i) = F(i) + c-G(i)F(i); 

G(i) = P(i)G(i); 
c = c/(n + 1); 

end 


18 /bf(i) = cr^ (P(l)/Q(l)) + F; 


Algorithm I: Gauss-Polynomial Bilateral Filter (GPF). 


The overall algorithm is summarized in Algorithm [T] We will 
henceforth refer to this as the Gauss-Polynomial-based Bilateral Fil¬ 
ter (GPF). Notice that we use centering and (|6]) to improve the accu¬ 
racy. Moreover, we efficiently implement steps ©toCOj. In partic¬ 
ular, we recursively compute the images in m and the factorials in 
(|9l) and fTOl) . Notice that steps 2-7,11-12, 14,15, and 18 are applied 
to each pixel (cheap pointwise operations). To avoid confusion, we 
note that the specification of the some of quantities in Algorithm [T] 
are somewhat different from the corresponding definitions in 0 - 

dlB. 

It is clear that the main computations in GPF are the Gaussian 
filterings in step 13 (and the initial filtering in step 8). That is, the 
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overall cost is dominated by the cost of computing + 1 Gaussian 
filterings. In this regard, we note that for the same degree N, the 
number of Gaussian filterings required in ifTbl is 4(A^ + 1). Indeed, 
we will see in Section |4| for a fixed N, the overall run-time of GPF 
is about a third of that of fT6l . Finally, we note that GPF involves 
the evaluation of a transcendental function just once in step 4. Thus, 
GPF is better suited to hardware implementation 12011211 compared 
to the algorithm in fT6l which involves the repeated evaluation of 
cosine and sine functions. 

4. EXPERIMENTS 

We now present some results concerning the accuracy and run-time 
of the proposed GPF algorithm. In particular, we compare it with 
some of the fast algorithms ifTOl [T^ [TSl ITSl . The experiments were 
performed using Matlab on an Intel quad-core 2.7 GHz machine with 
8 GB memory. We implemented the Gaussian filtering in GPF and 
(TSlIIll using Deriche’s constant-time algorithm (HI- The average 
run-times of the various fast algorithms are reported in Table [T] for a 
256 X 256. We do not redundantly report the run-times for different 
image sizes, since this can roughly be estimated from the run-times 
in Table [T] (the algorithms scale linearly with the number of pixels). 
Notice that the run time of GPF and ifTSlfTSl does not change appre¬ 
ciably with as . To evaluate the accuracy, we also report the mean- 
squared-error (MSB) between the exact implementation of Q and 
the result obtained using the fast algorithms in Table \T\ In partic¬ 
ular, the MSB between two images /(i) and p(i) is defined to be 
10 logio(MSE) dB, where MSE = (1/|7|) E. 67 (/(i) “ No- 

tice that GPF is competitive with the existing algorithms in terms of 
accuracy and run-time. In particular, GPF has the smallest run-time, 
and its MSB is in general better than the rest of the algorithms and 
comparable to that of the raised-cosine-based approximation in ca. 
The degree of the raised-cosine and the Gauss-polynomial filter is 20 
for all the experiments (this gives a good tradeoff between accuracy 
and run-time). In this regard, an open question is how the accuracy of 
GPF varies with the degree, as a function of as and cr^. This will be 
addressed in future work. Note that the run-time of the polynomial 
approximation in (El is almost identical to that of the proposed algo¬ 
rithm and is hence not reported. Bor a visual comparison, we report 
the result obtain on the Peppers image in Bigure|4| Notice the visi¬ 
ble distortions in subfigure (d) which arises on account of the poor 
approximation of the Gaussian kernel using Taylor polynomials. 

5. CONCLUSION 

We presented a fast algorithm for bilateral filtering based on Gauss- 
polynomial decompositions of the translations of the range kernel. 
We presented some preliminary results which demonstrated the ac¬ 
curacy and speed of the algorithm in comparison to some of the ex¬ 
isting fast algorithms for bilateral filtering. In particular, we saw that 
the algorithm is comparable to the one in (m, but has a smaller run 
time (about a third). Moreover, as remarked in the introduction, the 
proposed algorithm has an edge over ca in the context of hardware 
implementation - it is based on polynomials and does not involve the 
computation of multiple trigonometric functions (2n . We note that 
the algorithm has a direct extension to other variants of the bilateral 
filter including the joint and guided filter |[22l[^ , and can also be 
extended for handling volume and video data US). 


Table 1. The top table comparison of the average run-time of the 
exact, the proposed, and the fast algorithms in fTOl [TSl \TEi for dif¬ 
ferent values of as and fixed ar = 30. We used the Peppers image 
shown in FigurelH The settings suggested in fTOllTSlfTSl were used 
for the corresponding implementations. The algorithms were imple¬ 
mented using Matlab on an Intel quad-core 2.7 GHz machine with 8 
GB memory. The bottom table compares the MSB between the exact 
implementation of o and the respective algorithms. 
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